Below is a CNN trained on Alien and Predator images obtained from: https://www.kaggle.com/pmigdal/alien-vs-predator-images

  • Dataset contains 347 training samples of class Alien and 347 training samples of class Predator
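A quick way to confirm those counts, as a sketch assuming the directory layout used in load_data below:

import os
for split in ['train', 'test']:
    for cls in ['alien', 'predator']:
        path = 'alien-vs-predator-images/data/' + split + '/' + cls
        print(split, cls, len(os.listdir(path)))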
In [2]:
import os
import matplotlib.pyplot as plt
import numpy as np
import cv2
import random
from sklearn.utils import shuffle

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Conv2D, Dropout, MaxPooling2D
In [3]:
class_names_label = {'alien':0, 'predator':1}
In [4]:
# Preparing data
def load_data():
    datasets = ['alien-vs-predator-images/data/train', 'alien-vs-predator-images/data/test']
    size = (150,150)
    output = []

    for dataset in datasets:
        images = []
        labels = []
        
        for class_name in os.listdir(dataset):
            curr_label = class_names_label[class_name]

            for img in os.listdir(os.path.join(dataset, class_name)):
                img_path = os.path.join(dataset, class_name, img)
                curr_img = cv2.imread(img_path)
                # cv2 loads images as BGR; convert to RGB so plt.imshow displays them correctly
                curr_img = cv2.cvtColor(curr_img, cv2.COLOR_BGR2RGB)
                curr_img = cv2.resize(curr_img, size)
                images.append(curr_img)
                labels.append(curr_label)
                
        images = np.asarray(images, dtype='float32')
        labels = np.asarray(labels, dtype='float32')
        images, labels = shuffle(images, labels)
                
        output.append((images, labels))
        
    return output
In [5]:
(x_train, y_train), (x_test, y_test) = load_data()
x_train /= 255
x_test /= 255
In [23]:
# Exploring Data
sample = [0, 10, 40, 193, 333, 412, 252, 14, 600]
plt.figure(figsize=(16,8))  # one figure is enough; plt.figure() followed by plt.subplots() created a stray empty figure

for i in range(9):
    plt.subplot(3,3,i+1)
    title = 'Alien' if y_train[sample[i]] == 0 else 'Predator'
    plt.title(title)
    plt.imshow(x_train[sample[i]])
plt.tight_layout()
In [7]:
# Building Model

model = Sequential()
model.add(Conv2D(32,(3,3), activation='relu', input_shape = (150, 150, 3)))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(32,(3,3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(32,(3,3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(128,activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1,activation='sigmoid'))

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
In [29]:
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 148, 148, 32)      896       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 74, 74, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 72, 72, 32)        9248      
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 36, 36, 32)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 34, 34, 32)        9248      
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 17, 17, 32)        0         
_________________________________________________________________
flatten (Flatten)            (None, 9248)              0         
_________________________________________________________________
dense (Dense)                (None, 128)               1183872   
_________________________________________________________________
dropout (Dropout)            (None, 128)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 129       
=================================================================
Total params: 1,203,393
Trainable params: 1,203,393
Non-trainable params: 0
_________________________________________________________________
In [30]:
history = model.fit(x_train, y_train, epochs=20)
Train on 694 samples
Epoch 1/20
694/694 [==============================] - 14s 20ms/sample - loss: 0.7175 - accuracy: 0.5548
Epoch 2/20
694/694 [==============================] - 12s 18ms/sample - loss: 0.6135 - accuracy: 0.6599
Epoch 3/20
694/694 [==============================] - 13s 18ms/sample - loss: 0.5671 - accuracy: 0.7161
Epoch 4/20
694/694 [==============================] - 12s 18ms/sample - loss: 0.5275 - accuracy: 0.7522
Epoch 5/20
694/694 [==============================] - 13s 19ms/sample - loss: 0.5118 - accuracy: 0.7507
Epoch 6/20
694/694 [==============================] - 13s 19ms/sample - loss: 0.4777 - accuracy: 0.7968
Epoch 7/20
694/694 [==============================] - 13s 19ms/sample - loss: 0.4284 - accuracy: 0.8098
Epoch 8/20
694/694 [==============================] - 13s 18ms/sample - loss: 0.3992 - accuracy: 0.8271
Epoch 9/20
694/694 [==============================] - 13s 18ms/sample - loss: 0.3768 - accuracy: 0.8314
Epoch 10/20
694/694 [==============================] - 13s 19ms/sample - loss: 0.3262 - accuracy: 0.8660
Epoch 11/20
694/694 [==============================] - 12s 18ms/sample - loss: 0.2961 - accuracy: 0.8818
Epoch 12/20
694/694 [==============================] - 12s 18ms/sample - loss: 0.2452 - accuracy: 0.9078
Epoch 13/20
694/694 [==============================] - 12s 18ms/sample - loss: 0.1925 - accuracy: 0.9265
Epoch 14/20
694/694 [==============================] - 12s 18ms/sample - loss: 0.1691 - accuracy: 0.9366
Epoch 15/20
694/694 [==============================] - 12s 18ms/sample - loss: 0.1459 - accuracy: 0.9467
Epoch 16/20
694/694 [==============================] - 12s 17ms/sample - loss: 0.1209 - accuracy: 0.9597
Epoch 17/20
694/694 [==============================] - 12s 17ms/sample - loss: 0.0985 - accuracy: 0.9640
Epoch 18/20
694/694 [==============================] - 12s 17ms/sample - loss: 0.0730 - accuracy: 0.9841
Epoch 19/20
694/694 [==============================] - 12s 17ms/sample - loss: 0.0706 - accuracy: 0.9798
Epoch 20/20
694/694 [==============================] - 12s 17ms/sample - loss: 0.0519 - accuracy: 0.9841
In [31]:
model.evaluate(x_test, y_test)
200/1 [====================================] - 1s 6ms/sample - loss: 0.6327 - accuracy: 0.7750
Out[31]:
[0.8183332431316376, 0.775]
In [ ]:
model.save('AVPmodel_1.h5')
In [108]:
#Looking at misclassified images

predicted = np.round(model.predict(x_test), 0).reshape(-1)
misclassified = predicted != y_test

# Index images and labels with the same boolean mask so titles stay aligned
# with the images shown (the original code titled with y_test[i] instead)
mis_images = x_test[misclassified]
mis_labels = y_test[misclassified]

plt.figure(figsize=(16,8))
for i in range(4):
    plt.subplot(1,4,i+1)
    title = 'Alien' if mis_labels[i] == 0 else 'Predator'
    plt.title(title)
    plt.imshow(mis_images[i])
plt.tight_layout()

RESULTS

  • Accuracy on training set: 98.4%
  • Accuracy on test set: 77.5%

DISCUSSION

The model overfits the training data and does not generalize as well to the test set (98.4% training accuracy against 77.5% test accuracy). The chosen architecture has 3 convolutional layers and roughly 1.2M trainable parameters, almost all of which (9248 × 128 + 128 = 1,183,872) sit in the first dense layer. A CNN of this capacity is prone to overfitting when trained on only 694 samples.

However, a dropout layer was introduced to reduce overfitting, and the images were downscaled to 150 x 150 to help combat it further.
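The history object captured by fit above records per-epoch loss and accuracy but is never used. A minimal sketch of the learning curve; since fit was called without validation data, only the training-side keys exist here:

plt.figure(figsize=(8,4))
plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['accuracy'], label='training accuracy')
plt.xlabel('epoch')
plt.legend()
plt.show()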

Improvements

  • Model capacity could be reduced by removing layers (this was attempted with minimal success)
  • The training set could be augmented using a data generator that distorts and transforms the images, allowing better generalization (see the sketch after this list)
  • 694 training samples is a small dataset for a CNN of this depth; more data would help with generalization
  • On inspecting the dataset, the classes could be better represented with more diverse colour images and image types (sketch vs. CGI)
  • Weight regularizers could be added (also shown in the sketch below)
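A minimal sketch of the augmentation and regularization ideas above, assuming the directory layout used in load_data; the specific transform ranges and the 0.001 L2 factor are illustrative choices, not settings that were tested here:

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.regularizers import l2

# Random on-the-fly distortions applied to the training images only
train_gen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=20,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.2,
    horizontal_flip=True)

train_flow = train_gen.flow_from_directory(
    'alien-vs-predator-images/data/train',
    target_size=(150, 150),
    batch_size=32,
    class_mode='binary')  # subdirectories sort to alien=0, predator=1, matching class_names_label

# The 128-unit dense layer holds most of the 1.2M parameters, so it is the
# natural place for an L2 weight penalty when rebuilding the model:
#     Dense(128, activation='relu', kernel_regularizer=l2(0.001))

model.fit(train_flow, epochs=20)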